GPfit: An R Package for Fitting a Gaussian Process Model to Deterministic Simulator Outputs
Abstract
Gaussian process (GP) models are commonly used statistical metamodels for emulating expensive computer simulators. Fitting a GP model can be numerically unstable if any pair of design points in the input space is close together. Ranjan, Haynes, and Karsten (2011) proposed a computationally stable approach for fitting GP models to deterministic computer simulators. To maximize the likelihood, they used a genetic-algorithm-based approach that is robust but computationally intensive. This paper implements a slightly modified version of the model proposed by Ranjan et al. (2011) in the R package GPfit. A novel parameterization of the spatial correlation function and a clustering-based, multistart, gradient-based optimization algorithm yield robust optimization that is typically faster than the genetic-algorithm-based approach. We present two examples with R code to illustrate the usage of the main functions in GPfit. Several test functions are used for performance comparison with the popular R package mlegp. We also use GPfit in a real application: emulating the tidal kinetic energy model for the Bay of Fundy, Nova Scotia, Canada. GPfit is free software, distributed under the General Public License, and available from the Comprehensive R Archive Network.
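The instability mentioned in the abstract can be illustrated with a minimal sketch, which is not code from GPfit. It assumes a two-point design and a Gaussian correlation exp(-theta * |x1 - x2|^2). The 2 × 2 correlation matrix with a nugget added to the diagonal has eigenvalues 1 + nugget ± r, so its condition number is available in closed form. The function names, the value of `theta`, and the nugget size are all hypothetical choices for illustration; in particular, the nugget here is not the lower bound used by Ranjan et al. (2011).

```python
import math

def gauss_corr(x1, x2, theta=1.0, power=2.0):
    """Correlation between two scalar inputs (illustrative Gaussian form)."""
    return math.exp(-theta * abs(x1 - x2) ** power)

def cond_two_points(x1, x2, nugget=0.0, theta=1.0):
    """Condition number of [[1 + nugget, r], [r, 1 + nugget]].

    For 0 <= r < 1 the eigenvalues are 1 + nugget + r and 1 + nugget - r,
    so the condition number is their ratio.
    """
    r = gauss_corr(x1, x2, theta)
    return (1.0 + nugget + r) / (1.0 + nugget - r)

# Well-separated points: the matrix is well conditioned.
far = cond_two_points(0.0, 1.0)

# Nearly coincident points: r is close to 1 and the matrix is near-singular.
near = cond_two_points(0.0, 1e-3)

# A small (arbitrary, illustrative) nugget bounds the smallest eigenvalue away from 0.
near_nug = cond_two_points(0.0, 1e-3, nugget=1e-4)

print(far, near, near_nug)
```

As the two points approach each other the condition number grows without bound, which is what makes inverting the correlation matrix unreliable; even a small nugget reduces it by orders of magnitude, at the cost of no longer interpolating the data exactly.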
Related works
A Computationally Stable Approach to Gaussian Process Interpolation of Deterministic Computer Simulation Data
For many expensive deterministic computer simulators, the outputs do not have replication error and the desired metamodel (or statistical emulator) is an interpolator of the observed data. Realizations of Gaussian spatial processes (GP) are commonly used to model such simulator outputs. Fitting a GP model to n data points requires the computation of the inverse and determinant of n × n correlat...
Multivariate Gaussian Process Emulators With Nonseparable Covariance Structures
Gaussian process regression models or ‘emulators’ have become popular in the statistical analysis of deterministic computer models (simulators), in particular for computationally expensive models where the emulator is used as a fast surrogate. For models with multivariate output, common practice is to specify a separable covariance structure for the Gaussian process. Though computationally conv...
A generalized super-efficiency model for ranking extreme efficient DMUs in stochastic DEA
In this current study a generalized super-efficiency model is first proposed for ranking extreme efficient decision making units (DMUs) in stochastic data envelopment analysis (DEA) and then, a deterministic (crisp) equivalent form of the stochastic generalized super-efficiency model is presented. It is shown that this deterministic model can be converted to a quadratic programming model. So fa...
Local Likelihood Estimation for Covariance Functions with Spatially-Varying Parameters: The convoSPAT Package for R
In spite of the interest in and appeal of convolution-based approaches for nonstationary spatial modeling, off-the-shelf software for model fitting does not as of yet exist. Convolution-based models are highly flexible yet notoriously difficult to fit, even with relatively small data sets. The general lack of pre-packaged options for model fitting makes it difficult to compare new methodology i...
Efficient optimization of the likelihood function in Gaussian process modelling
Gaussian Process (GP) models are popular statistical surrogates used for emulating computationally expensive computer simulators. The quality of a GP model fit can be assessed by a goodness of fit measure based on optimized likelihood. Finding the global maximum of the likelihood function for a GP model is typically challenging, as the likelihood surface often has multiple local optima, and an ...